# Self-attention Optimization
Vidtome
MIT
A zero-shot video editing method built on diffusion models that improves temporal coherence and reduces memory consumption by merging self-attention tokens across video frames.
Text-to-Video
jadechoghari
15
9
PaViT
MIT
PaViT is an image recognition model based on the Pathway Vision Transformer and inspired by Google's PaLM. It focuses on applying few-shot learning techniques to image recognition tasks.
Image Classification, Supports Multiple Languages
Ajibola
20
2